
[ROCm] Add initial ROCm PJRT support #7896

Open · wants to merge 5 commits into master
Conversation

mmakevic-amd

The plugin uses the same PJRT C API implementation as CUDA (just configured for ROCm): https://github.com/openxla/xla/blob/main/xla/pjrt/c/pjrt_c_api_gpu_internal.cc
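
For context, a PJRT plugin package typically just wraps the compiled pjrt_c_api_gpu_plugin.so and points torch_xla at it. A minimal sketch, assuming torch_xla's experimental plugins interface (the package layout and the 'ROCM' device name here are illustrative, not this PR's exact code):

```python
import os

from torch_xla.experimental import plugins


class RocmPlugin(plugins.DevicePlugin):
  """Hypothetical ROCm plugin wrapper, modeled on the CUDA plugin package."""

  def library_path(self) -> str:
    # Path to the PJRT C API shared library built from OpenXLA with
    # --config=rocm. The in-wheel location is an assumption.
    return os.path.join(
        os.path.dirname(__file__), 'lib', 'pjrt_c_api_gpu_plugin.so')


# A plugin package would typically register itself at import time:
plugins.register_plugin('ROCM', RocmPlugin())
```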

@mmakevic-amd (Author)

Hi @will-cromar, could you please review? Thanks!

@will-cromar (Collaborator) left a comment

Would it be possible for you to move the plugin to your own repository? The packages in this repository are the ones that we (i.e., the PyTorch/XLA team) test, release, and maintain ourselves. Additional device support should be maintained out of tree.

You should be able to build the actual plugin directly from OpenXLA without having to create your own Bazel workspace (something like bazel build --config=rocm //xla/pjrt/c:pjrt_c_api_gpu_plugin.so). Most of the Bazel build configuration in this repository supports the main torch_xla package build and won't be relevant to the plugin.

The Python parts of the package all look right to me.

We can accept the change in __init__.py since ROCm is a special case. We can also include a link to your plugin in our README.
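
For illustration only (the actual diff isn't reproduced here), that __init__.py hook could be as small as an environment-gated import; the variable check and the plugin module name below are assumptions:

```python
import os

if os.environ.get('PJRT_DEVICE') == 'ROCM':
  try:
    # Importing the plugin package registers the ROCm PJRT plugin.
    import torch_xla_rocm_plugin  # noqa: F401
  except ImportError:
    pass
```

Keeping the hook this thin is what makes out-of-tree maintenance workable: the repository carries only the registration point, not the plugin itself.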

@mmakevic-amd (Author)

> Would it be possible for you to move the plugin to your own repository? […]

Thanks for the feedback! I'll discuss setting up our own repository with my team and will reach back once it's up. :)
